Modeling phonetic context with non-random forests for speech recognition
Authors
Abstract
Modern speech recognition systems typically cluster triphone phonetic contexts using decision trees. In this paper we describe a way to build multiple complementary decision trees from the same data for the purpose of system combination. We build the trees jointly, using an objective function with an added entropy term that encourages diversity among them. After the trees are built, the systems are built in the standard way, and their emission probabilities are combined during decoding. Experiments on multiple datasets show gains from the use of multiple trees, at the cost of evaluating multiple models at test time.
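The abstract does not say how the emission probabilities are combined, so the following is only a minimal sketch of two common options, not the authors' method. It assumes each tree-specific system yields a log-likelihood for the current frame; the function names, the uniform default weights, and the choice between log-domain and probability-domain averaging are all illustrative assumptions.

```python
import math


def combine_log_domain(per_tree_loglikes, weights=None):
    """Weighted sum of per-tree log-likelihoods (geometric-mean style).

    Hypothetical helper: the combination rule is an assumption, not taken
    from the paper.
    """
    n = len(per_tree_loglikes)
    if n == 0:
        raise ValueError("need at least one tree")
    if weights is None:
        weights = [1.0 / n] * n  # assumed uniform weights
    if len(weights) != n:
        raise ValueError("one weight per tree is required")
    return sum(w * ll for w, ll in zip(weights, per_tree_loglikes))


def combine_prob_domain(per_tree_loglikes, weights=None):
    """Log of the weighted average of per-tree likelihoods (mixture style).

    Uses the log-sum-exp trick so very small likelihoods do not underflow.
    """
    n = len(per_tree_loglikes)
    if n == 0:
        raise ValueError("need at least one tree")
    if weights is None:
        weights = [1.0 / n] * n  # assumed uniform weights
    if len(weights) != n:
        raise ValueError("one weight per tree is required")
    m = max(per_tree_loglikes)
    total = sum(w * math.exp(ll - m) for w, ll in zip(weights, per_tree_loglikes))
    return m + math.log(total)
```

Either function would be called once per frame and per hypothesized state, which illustrates the test-time cost the abstract mentions: every tree's model has to be evaluated before the scores can be merged.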
Similar resources
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Multiple-State Context-Dependent Phonetic Modeling with MLP
Earlier hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) continuous speech recognition systems have not modeled context-dependent phonetic effects, sequences of distributions for phonetic models, or gender-based speech consistencies. In this paper we present a new MLP architecture and training procedure for modeling context-dependent phonetic classes with a sequence of distribu...
Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese
We study the problem of phonetic modeling for continuous Mandarin speech recognition by providing a systematic performance comparison for systems based on the following primitive speech units: syllable, demi-syllable (Initials and Finals), context-independent phones, left-or-right context-dependent phones (diphones), and left-and-right context-dependent phones (triphones). In our speaker-dependent con...
Random Forests In Language Modeling
In this paper, we explore the use of Random Forests (RFs) (Amit and Geman, 1997; Breiman, 2001) in language modeling, the problem of predicting the next word based on words already seen before. The goal in this work is to develop a new language modeling approach based on randomly grown Decision Trees (DTs) and apply it to automatic speech recognition. We study our RF approach in the context of ...